Skip to content

Conversation

@djolertrk
Copy link
Contributor

Handle DebugVar decorator and new debug_info section in the final .masp.

@djolertrk djolertrk marked this pull request as draft December 15, 2025 07:30
@djolertrk
Copy link
Contributor Author

Goes on top of #2443.

@djolertrk
Copy link
Contributor Author

cc @greenhat @bitwalker

@djolertrk
Copy link
Contributor Author

In order to try everything within compiler, I have used the commit 81cc603c1b9d5e6a55d82daee5fc1016cee33c02 (tag: v0.19.1) as a base branch... Since the debug variable tracking (0xMiden/compiler#820) depends on this, we may want to implement majority of these things in the compiler.

cc @greenhat @bitwalker

@djolertrk djolertrk marked this pull request as ready for review January 9, 2026 17:34
/// If this is a function parameter, its 1-based index
arg_index: Option<u32>,
/// Source file location (file:line:column)
location: Option<FileLineCol>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is needed, as mentioned in my other comment - but let me know if this is only intended to be set in cases where the location differs from that of the associated instruction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add comment here to make it clear.

@@ -273,6 +272,7 @@ pub enum Instruction {
// ----- debug decorators --------------------------------------------------------------------
Breakpoint,
Debug(DebugOptions),
DebugVar(miden_core::DebugVarInfo),
Copy link
Collaborator

@bitwalker bitwalker Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be added as an instruction - only a decorator, just like AssemblyOp.

That said, currently, the AST/assembler doesn't provide a way to attach decorators that don't correspond to an instruction to the AST before assembly.

So - I think the best way to handle this is to support Instruction variants like DebugVar, but do not pretty-print them like instructions, and do not add support in the grammar for parsing them. The compiler will then have an easy way to emit these decorators, but they don't end up getting treated like normal instructions.

I would suggest we add a method on Instruction called has_textual_representation which matches on the instruction variant and returns true if it matches any variant other than those like DebugVar which are not present in the textual format of MASM. I'm happy to bikeshed the name too, but has_textual_representation feels like the most precise way to name it without it unintentionally becoming used for something with different semantics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree!

/// Path to the source file (index into string table)
pub path_idx: u32,
/// Optional directory containing the file (index into string table)
pub directory_idx: Option<u32>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this redundant with path_idx, i.e. if we have the path of the source file, then that presumably includes any directory information already, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was intending to kinda match with what other debug info representations do (like DWARF or CodeView), where it tries to keep the base/common root dir shared among other relative paths depending on that root, but since in miden(c) we always operate with full paths, we can simplify this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep it, I'd make a note that when directory_idx is set, then path_idx is a relative path, otherwise path_idx is expected to be absolute.

We will probably want to explore splitting paths into those components in the future, so we can further reduce the size of the serialized debug info, but I'd treat that as lower priority for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a TODO marker for this in 7bbd988, so we can introduce these fields later.

/// Column where the variable is declared
pub column: u32,
/// Scope depth (0 = function scope)
pub scope_depth: u32,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be interested to understand how one can leverage a non-zero scope_depth in practice, i.e. how is it used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a variable with the same name in a nested/inner block debugger needs to differentiate between that one and the one from outer scope. I will add an example in form of comment here in the code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see - in the debugger we only need to care about definitions at the current depth (and above, i.e. lower depth) for purposes of being able to show variable information at the current program point. We don't need to worry about conflicts at the same depth, since only one variable definition at that depth can exist, even if there are multiple scopes with that depth in the current function.

Definitely a good idea to document those semantics here I think, but at least the usefulness is clear to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the clarification!

let process = &mut self.state();
host.on_trace(process, *id)?;
},
Decorator::DebugVar(_debug_var) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep this behavior, I think I would reword this as:

DebugVar is strictly a metadata-carrying decorator used by downstream tooling, i.e. it is non-actionable in the processor itself.

That said, IMO we need some way for the debugger to be notified when variable information is found/changes, so that it can update its internal state accordingly, otherwise we will be forced to step every instruction manually to check for this decorator in case that information changes. I'd prefer not to have to do that.

IMO, we should manifest this either as a raw trace event, modify on_debug to support triggering that callback for this decorator, or add a new callback that is specific to DebugVar. Whichever approach we choose, we'd only invoke the callback when tracing is enabled in the processor.

/cc @bobbinth @plafer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't speak to whether this decorator is the right approach for the debugger. Though with #2541 we want to move away from representing AsmOp (and now DebugVar too) as decorators that can be visited needlessly by the processor during execution.

Though for now this is the best we can do, provided that the DebugVar decorator is the right approach for the debugger.

Note: is this a feature that would only be accessible for programs written in Rust, or also from MASM? Or in other words, would there be fewer features made available by the debugger for debugging programs written in MASM compared to programs written in Rust?

Copy link
Contributor Author

@djolertrk djolertrk Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: is this a feature that would only be accessible for programs written in Rust, or also from MASM? Or in other words, would there be fewer features made available by the debugger for debugging programs written in MASM compared to programs written in Rust?

If there is a way of using a concept of "variable" at MASM level we can create a decorator for it -- since this is not Rust specific, but rather an abstraction. However, it needs to have a name, type..., so debugger can decode its value properly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plafer Since MASM doesn't have variables, this decorator wouldn't ever be produced from Miden Assembly sources. The closest we have are procedure locals, but those aren't variables in any practical sense; rather they collectively represent a fixed-size region of automatically-allocated, procedure-scoped memory, so I don't think it would make sense to use DebugVar for them - at least not without some MASM syntax for local variables that gives them a name and type, and is then used to derive the size and placement of procedure locals needed for that procedure.

But, as @djolertrk noted, this can be used for any source language with variables, it isn't Rust-specific.

On the topic of debug decorators like this and AssemblyOp, I'd be happy to replace them with something that doesn't rely on the decorator mechanism at all, but I'm not actually sure what that would be. All we really need is a way to map a MAST node to the debug information that applies to it, ideally with a way to disambiguate the debug information based on context (e.g. current call stack). AFAIK, that's essentially how decorators work now though (sans the disambiguation bit), so it isn't clear to me what we gain or lose by adding decorators like this. Definitely open to figuring out how best to handle this sort of thing going forward though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With #2541 we basically already have a plan to move out any current "decorator" which only "provides metadata" (TLDR they get their own separate CSR matrix). But since that doesn't exist yet, we can just add the DebugVar as a decorator for now (as-is in this PR), and we'll pull it out along with AsmOp when we address #2541.

Copy link
Collaborator

@bitwalker bitwalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks pretty good! See my comments for more specific feedback on things I think we may need to change.

One thing that I think may be worth adding to miden_mast_package as part of this, is a higher-level type or set of types which reify the encoded representation of the debug information into usable form that has been validated and can be directly queried by tools such as the debugger.

As it stands, what you get from deserializing the section is still a very raw form (i.e. references to other data in the section is not materialized/validated). That is useful as a low-level primitive, but means that any tooling that wants to actually use the information encoded in the package has to build that high-level representation themselves. While in some cases that may be necessary/desirable, I suspect that in practice, a single canonical representation defined in miden_mast_package would be sufficient for the uses we have in mind.

My only other note is that I'd like to see a bit more documentation on what the various bits of information are used for, how to put them together. This would of course be one thing solved by providing a higher-level representation of the debug info, but even then there are some bits of information that are somewhat unclear on what their usage should be (e.g. my comment on scope_depth).

Ultimately, one of the things I'd want the high-level representation to provide, would be a set of API functions that correspond to important debugger queries, without the debugger needing to know too much about the internal encoding of the debug info provided by a Package.

@djolertrk
Copy link
Contributor Author

I think this looks pretty good! See my comments for more specific feedback on things I think we may need to change.

One thing that I think may be worth adding to miden_mast_package as part of this, is a higher-level type or set of types which reify the encoded representation of the debug information into usable form that has been validated and can be directly queried by tools such as the debugger.

As it stands, what you get from deserializing the section is still a very raw form (i.e. references to other data in the section is not materialized/validated). That is useful as a low-level primitive, but means that any tooling that wants to actually use the information encoded in the package has to build that high-level representation themselves. While in some cases that may be necessary/desirable, I suspect that in practice, a single canonical representation defined in miden_mast_package would be sufficient for the uses we have in mind.

My only other note is that I'd like to see a bit more documentation on what the various bits of information are used for, how to put them together. This would of course be one thing solved by providing a higher-level representation of the debug info, but even then there are some bits of information that are somewhat unclear on what their usage should be (e.g. my comment on scope_depth).

Ultimately, one of the things I'd want the high-level representation to provide, would be a set of API functions that correspond to important debugger queries, without the debugger needing to know too much about the internal encoding of the debug info provided by a Package.

Thanks a lot for your comments! I have rebased this on top of the latest next branch, and addressed all comments. My comment regarding API for the debugger -- I have left a comment with a TODO marker, so we can pick it up once compiler side is ready and when we can have hands-on on the consumer/debugger side. Please do note that I have added a couple of documents in the 0xMiden/compiler#820 that describe the debug info design at both IRs level and in the final package.

  - Add DebugVarInfo and DebugVarLocation
  - Add Debug info types:
    - DebugInfoSection (main container)
    - DebugTypeInfo - Type definitions (Primitive, Pointer, Array, Struct, Function, Unknown)
    - DebugPrimitiveType - 16 primitive types (Void through Word)
    - DebugFileInfo
    - DebugFunctionInfo
    - DebugVariableInfo
    - DebugInlinedCallInfo (for Inlined call site tracking)
  - debug_info/serialization.rs - Full serialization/deserialization of
    the debug info section
@djolertrk djolertrk force-pushed the feature/debug-variable-locations-clean branch 4 times, most recently from 3289e5e to 193b063 Compare January 10, 2026 14:31
@djolertrk
Copy link
Contributor Author

As you can see in the 0xMiden/compiler#745, I have back-ported these commits from miden-vm based on "next" to a branch based on "miden-vm v19.0.1" (https://github.com/walnuthq/miden-vm/tree/feature/debug-variable-locations-v0.19.1) so I can test it all together with compiler and debugger, so the latest commits here represent debugger API that is tested with 0xMiden/miden-debug#17.

@djolertrk djolertrk force-pushed the feature/debug-variable-locations-clean branch 2 times, most recently from 6774b84 to 132a8c6 Compare January 19, 2026 10:16
- Delete accidentally committed ._target file, add ._* to .gitignore
- Change DebugVarInfo.name from String to Arc<str> for sharing references
- Change DebugVarInfo.arg_index to Option<NonZeroU32>
- Change DebugVarLocation::Const to use Felt instead of u64
- Add Instruction::has_textual_representation() method for DebugVar
- Update print.rs to return Document::Empty for non-textual instructions
- Add documentation explaining location should only differ from AssemblyOp
- Add documentation for directory_idx explaining relative path usage
- Use LineNumber/ColumnNumber types in mast-package debug_info types
- Document scope_depth usage with detailed examples
- Update processor callback comments for DebugVar decorator
- Add miden-debug-types dependency to mast-package
- Simplify DebugFileInfo to use full paths only
Add a new callback mechanism for DebugVar decorators to enable debuggers
to track source-level variable information during execution.

This allows debuggers like miden-debug to receive notifications when
variable location information changes, enabling features like displaying
local variables and their values during debugging sessions.
- Changed Local(u16) to Local(i16) to store signed FMP offset directly
- The offset is computed as: idx - num_locals
- Updated Display format
- Updated serialization to handle i16
- Address is computed as: FMP + offset
@djolertrk djolertrk force-pushed the feature/debug-variable-locations-clean branch from 132a8c6 to f0ae386 Compare January 19, 2026 10:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants